An Efficient Mechanism for Stemming and Tagging: The Case of Greek Language

Authors

  • Giorgos Adam
  • Konstantinos Asimakis
  • Christos Bouras
  • Vassilis Poulopoulos
Abstract

In an era in which searching the WWW for information has become a tedious task, search engines and other data mining mechanisms need to be enhanced with capabilities such as NLP in order to better analyze and recognize user queries and fetch data. We present an efficient mechanism for stemming and tagging for the Greek language. Our system is constructed so that it can be easily adapted to any existing system and support it with recognition and analysis of Greek words. We examine the accuracy of the system and its ability to support peRSSonal, a medium constructed for offering meta-portal news services to internet users. We present an experimental evaluation of the system compared with existing stemmers and taggers for the Greek language, and we show higher efficiency and quality of results.


Similar resources

The Discursive Construction of Ethnic Identities: The Case of Greek-Cypriot Students

This study examines how Greek-Cypriot students aged 12 to 18, an understudied group of students, construct their ethnic identity in a complex setting such as Cyprus and what motivates the students in the selection of ethnic identity labels. The choice to focus on students aged 12-18 was made on the hypothesis that young children, who did not experience the 1974 war in Cyprus, may have a differe...


A Part-of-Speech Tagging System for the Persian Language

Abstract: Part-Of-Speech (POS) tagging is an essential step for many models and methods in other areas of natural language processing, such as machine translation, spell checking, text-to-speech, and automatic speech recognition. So far, highly accurate POS taggers have been created for many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...


An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...


Joint PoS Tagging and Stemming for Agglutinative Languages

The number of word forms in agglutinative languages is theoretically infinite and this variety in word forms introduces sparsity in many natural language processing tasks. Part-of-speech tagging (PoS tagging) is one of these tasks that often suffers from sparsity. In this paper, we present an unsupervised Bayesian model using Hidden Markov Models (HMMs) for joint PoS tagging and stemming for ag...


Greek-English Cross Language Retrieval of Medical Information

Health information systems on the web mainly support the English language. Accessing high-quality online health information is therefore frequently difficult for non-English speakers or speakers of English as a foreign language. In this work we present a cross-language retrieval system that helps Greek users in the medical domain overcome the language barrier. We have performed a case study on the...



Publication date: 2010